CLAIMS: 

We claim: 

1. A method for autonomically diagnosing and correcting error conditions in a
computing system of interrelated components and resources, the method comprising 
the steps: 

for each one of the components, reporting error conditions in a log file using both 
uniform conventions for naming dependent ones of the interrelated components and 
resources and also a common error reporting format; 

detecting error conditions arising from individual ones of the interrelated 
components; 

responsive to detecting an error condition in a specific one of the components, 
parsing a log associated with said specific one of the components to determine whether 
said error condition arose from a fault in one of the interrelated components and 
resources named in said associated log, and further parsing a log associated with said 
one of the interrelated components and resources to identify a cause for said fault; and, 

correcting said fault. 

2. The method of claim 1, further comprising the steps of:

inserting analysis code in said specific one of the components responsive to 
detecting said error condition, said analysis code having a configuration for reporting
operational data associated with said error condition; and, 

utilizing said reported operational data to identify a cause for said error condition.

3. The method of claim 1, further comprising the steps of:

activating dormant analysis code in said specific one of the components 
responsive to detecting said error condition, said dormant analysis code having a 
configuration for reporting operational data associated with said error condition; and, 

utilizing said reported operational data to identify a cause for said error condition. 

4. The method of claim 1, further comprising the steps of:

inserting analysis code in both said specific one of the components and said one 
of the interrelated components and resources responsive to detecting said error 
condition, said analysis code having a configuration for reporting operational data for 
said specific one of the components and said one of the interrelated components and 
resources; and, 

utilizing said reported operational data to correlate error conditions in each of 
said specific one of the components and said one of the interrelated components and 
resources to identify a cause for said error condition. 

5. The method of claim 1, further comprising the step of inserting analysis code in
said specific one of the components responsive to detecting said error condition, said 
analysis code having a configuration for suspending the operation of said specific one 
of the components pending resolution of said error condition. 

6. The method of claim 1, wherein said correcting step comprises the steps of:

determining from said further parsing step whether said fault in said one of the
interrelated components and resources named in said associated log arose from an 
additional fault in yet another one of the interrelated components and resources; and, 

repeating each of the parsing and correcting steps for said yet another
one of the interrelated components and resources.

7. An autonomic system for diagnosing and correcting error conditions among 
interrelated components and resources comprising: 

a plurality of commonly formatted log files utilizing standardized naming 
conventions for the interrelated components and resources, each of said commonly 
formatted log files having an association with one of the interrelated components and 
resources; and, 

an autonomic system administrator coupled to each of the interrelated 
components and resources and configured to parse said log files to identify both error 
conditions arising in associated ones of the interrelated components and resources, 
and also dependent ones of the interrelated components and resources giving rise to 
the identified error conditions. 

8. The autonomic system of claim 7, further comprising:

a codebase of analysis code; and,

code insertion logic coupled to said autonomic system administrator and 
programmed to insert portions of said analysis code in selected ones of the interrelated

9. The autonomic system of claim 8, wherein said analysis code comprises byte
code and wherein said code insertion logic comprises byte code insertion logic. 

10. A machine readable storage having stored thereon a computer program for 
autonomically diagnosing and correcting error conditions in a computing system of 
interrelated components and resources, the computer program comprising a routine set 
of instructions for causing the machine to perform the steps: 

for each one of the components, reporting error conditions in a log file using both 
uniform conventions for naming dependent ones of the interrelated components and 
resources and also a common error reporting format; 

detecting error conditions arising from individual ones of the interrelated 
components; 

responsive to detecting an error condition in a specific one of the components, 
parsing a log associated with said specific one of the components to determine whether 
said error condition arose from a fault in one of the interrelated components and 
resources named in said associated log, and further parsing a log associated with said 
one of the interrelated components and resources to identify a cause for said fault; and,

correcting said fault.

11. The machine readable storage of claim 10, further comprising the steps of:

inserting analysis code in said specific one of the components responsive to
detecting said error condition, said analysis code having a configuration for reporting 
operational data associated with said error condition; and, 

utilizing said reported operational data to identify a cause for said error condition. 

12. The machine readable storage of claim 10, further comprising the steps of:

activating dormant analysis code in said specific one of the components
responsive to detecting said error condition, said dormant analysis code having a
configuration for reporting operational data associated with said error condition; and, 

utilizing said reported operational data to identify a cause for said error condition. 

13. The machine readable storage of claim 10, further comprising the steps of:

inserting analysis code in both said specific one of the components and said one
of the interrelated components and resources responsive to detecting said error
condition, said analysis code having a configuration for reporting operational data for 
said specific one of the components and said one of the interrelated components and 
resources; and, 

utilizing said reported operational data to correlate error conditions in each of 
said specific one of the components and said one of the interrelated components and 
resources to identify a cause for said error condition. 

14. The machine readable storage of claim 10, further comprising the step of
inserting analysis code in said specific one of the components responsive to detecting
said error condition, said analysis code having a configuration for suspending the 
operation of said specific one of the components pending resolution of said error 
condition. 

15. The machine readable storage of claim 10, wherein said correcting step 
comprises the steps of: 

determining from said further parsing step whether said fault in said one of the 
interrelated components and resources named in said associated log arose from an 
additional fault in yet another one of the interrelated components and resources; and, 

repeating each of the parsing and correcting steps for said yet another
one of the interrelated components and resources.
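
For illustration only, the parsing and repeating steps recited in claims 1, 6, 10 and 15 can be sketched as follows. This is a minimal sketch under stated assumptions and forms no part of the claims: the names LogEntry, find_root_fault and diagnose_and_correct, the in-memory representation of the log files, and the rule of reading only the most recent entry of each log are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set


@dataclass
class LogEntry:
    """One record in the assumed common error reporting format."""
    component: str                    # component that wrote the entry
    error: str                        # description of the error condition
    depends_on: Optional[str] = None  # uniformly named dependency, if any


def find_root_fault(component: str,
                    logs: Dict[str, List[LogEntry]],
                    seen: Optional[Set[str]] = None) -> Optional[LogEntry]:
    """Follow dependency names from log to log until no deeper fault is found."""
    seen = set() if seen is None else seen
    if component in seen:             # guard against cyclic dependencies
        return None
    seen.add(component)

    entries = logs.get(component, [])
    if not entries:                   # nothing logged for this component
        return None

    latest = entries[-1]              # assumption: the newest entry is the relevant one
    if latest.depends_on is not None:
        # Parse the log of the named dependency, repeating as needed.
        deeper = find_root_fault(latest.depends_on, logs, seen)
        if deeper is not None:
            return deeper
    return latest


def diagnose_and_correct(component: str,
                         logs: Dict[str, List[LogEntry]],
                         fixes: Dict[str, Callable[[LogEntry], None]]
                         ) -> Optional[LogEntry]:
    """Locate the root fault and apply a hypothetical corrective action."""
    root = find_root_fault(component, logs)
    if root is not None and root.component in fixes:
        fixes[root.component](root)
    return root
```

In this sketch the uniform naming convention is what allows the depends_on field of one log to serve directly as the lookup key for the next log, and the recursion stops at the first component whose log names no further dependency.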